Abstract. We consider the complexities of substitutive sequences over a binary alphabet. By studying various types of special words, we show that, knowing some initial values, the complexity can be completely formulated via a recurrence formula determined by the characteristic polynomial.
1. Introduction

The study of substitutions over a finite alphabet plays an important role in many
fields, such as finite automata, symbolic dynamics, formal languages, number theory
and fractal geometry. It has various applications to quasi-crystals, computational
complexity, information theory, etc. (see [D S d El CQ] and the references
therein). In addition, substitutions are also fundamental objects in combinatorial
group theory.

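Given an infinite sequence $\xi \in \mathcal{A}^{\mathbb{N}}$ over some finite alphabet $\mathcal{A}$,
we denote by $\mathcal{L}_n(\xi)$ the set $\{\xi_i \cdots \xi_{i+n-1} \mid i \ge 1\}$ of factors of $\xi$ of length $n$ ($n \ge 1$),
and by convention $\mathcal{L}_0(\xi)$ is the singleton consisting of the empty word $\varepsilon$. The
set $\mathcal{L}(\xi) = \bigcup_{n \ge 0} \mathcal{L}_n(\xi)$ is then called the language of $\xi$, and the function
$p_\xi(n) := \#\mathcal{L}_n(\xi)$ the complexity of $\xi$; here and hereafter $\#$ denotes the cardinality of a
finite set.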

Let $\mathcal{A}^*$ be the free monoid generated by $\mathcal{A}$ (with $\varepsilon$ as the neutral element). A
morphism $\sigma : \mathcal{A}^* \to \mathcal{A}^*$ is called a substitution. We deal only with non-erasing
substitutions (the image of any letter in $\mathcal{A}$ is not the empty word), whence the
substitution can be extended naturally to $\mathcal{A}^{\mathbb{N}}$, the set of infinite sequences over
$\mathcal{A}$. Denote by $\xi_\sigma$ any one of the fixed points of $\sigma$ (that is, $\sigma(\xi_\sigma) = \xi_\sigma$), if it exists.
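As an illustration of these definitions, the following minimal Python sketch builds a prefix of a fixed point by iterating a substitution and counts the factors of each length in that prefix. It uses the substitution $a \mapsto aab$, $b \mapsto ba$ from the example at the end of the paper; the function names are hypothetical, and the counts agree with $p_\xi(n)$ only under the assumption that the prefix is long enough to contain every factor of length $n$.

```python
def apply(sigma, word):
    """Apply the substitution letter by letter (sigma is a morphism)."""
    return "".join(sigma[c] for c in word)


def fixed_point_prefix(sigma, seed, iterations):
    """Return sigma^iterations(seed); a prefix of a fixed point when sigma(seed) starts with seed."""
    word = seed
    for _ in range(iterations):
        word = apply(sigma, word)
    return word


def complexity(word, n):
    """Number of distinct factors of length n occurring in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


# Substitution taken from the example in the last section of the paper.
sigma = {"a": "aab", "b": "ba"}
prefix = fixed_point_prefix(sigma, "a", 9)   # assumption: 9 iterations suffice for small n
values = [complexity(prefix, n) for n in range(6)]
print(values)
```

The finite prefix only approximates the infinite sequence, so for larger $n$ the number of iterations should be increased accordingly.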

The study of the complexity of $\xi_\sigma$ (also called the complexity of $\sigma$) has a long
history. In general, it is very difficult to find an explicit formula for $p_{\xi_\sigma}(n)$
for a given $\sigma$; only some calculations for specific classes of substitutions can be
found in the literature. Here are some known results:

• $p_\xi(n) \le n$ for some $n$ if and only if $\xi$ is ultimately periodic, and in this
case the complexity is bounded [13];

• A sequence $\xi$ of complexity $p_\xi(n) = n + 1$ is called Sturmian. There are
many equivalent characterizations and interesting properties of Sturmian
sequences (see, e.g. 0[18lE2]);

• Rote DU constructed a class of sequences with complexity 2n by using 
graphs; 

• Mossé [H] studied the case of $q$-automata (which correspond to substitutions
of constant length). A method to compute $p(n)$ with a linear recurrence
formula was given under some technical conditions;

• Over a ternary alphabet, a class of Tribonacci type substitutions with
complexity $2n + 1$ was introduced by Arnoux and Rauzy [3]. An example
of a substitution (Triplex Substitution) with complexity $3n$ is presented by
the authors PT| ;

• For a fixed point of a substitution, the complexity can only be of the
following five asymptotic forms: $\Theta(1)$, $\Theta(n)$, $\Theta(n \log\log n)$, $\Theta(n \log n)$
or $\Theta(n^2)$, where $\Theta(g(n))$ means a function $f(n)$ satisfying
$0 < \liminf f(n)/g(n) \le \limsup f(n)/g(n) < \infty$ [15];

• For a survey and more general computations of the factor complexity of words
(over an alphabet of cardinality greater than 2), see [6] [8].

In this paper, we consider general substitutions $\sigma$ over a binary alphabet. Using
Mossé's theory of identifiability ([E]) and by studying various types of special
words (151 i), we show that the complexity $p(n)$ can be completely formulated
knowing some initial values, and a recurrence formula is given.

2. Notations and Preliminary 

We fix the binary alphabet $\mathcal{A} = \{a, b\}$ consisting of two letters $a$ and $b$. Let
$\mathcal{A}^*$ be the free monoid generated by $\mathcal{A}$ (with the empty word $\varepsilon$ as the neutral
element), and $\mathcal{A}^{\mathbb{N}}$ be the set of all infinite sequences (also called infinite words)
over $\mathcal{A}$.

If $w \in \mathcal{A}^*$, we denote by $|w|$ its length and by $|w|_a$ (resp. $|w|_b$) the number of
occurrences of the letter $a$ (resp. $b$) in $w$. The abelian Parikh vector of $w$ is then
defined to be the column vector $L(w) = (|w|_a, |w|_b)^t \in \mathbb{N}^2$.

A word $v$ is a factor of a word $w$ (written as $v \in w$) if there exist $u, u' \in \mathcal{A}^*$
such that $w = uvu'$. It is sometimes convenient to use the notation "$\circledast$" to stand
for some word whose exact value does not matter. Thus $v$ is a factor of a word $w$ if
and only if $w = \circledast v \circledast$ (note that, even within one formula, different $\circledast$'s may represent
different words). We say that $v$ is a prefix (resp. a suffix) of $w$ if $w = v\circledast$
(resp. $w = \circledast v$), and then we write $v \triangleleft w$ (resp. $v \triangleright w$). Two words $v$ and $w$ are
said to be comparable, written $v \bowtie w$, if either $v \triangleright w$ or $w \triangleright v$. The notions of
factor and prefix extend to infinite words in a natural way.

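It is also convenient to put, e.g., $\mathcal{A}^*v := \{xv;\ x \in \mathcal{A}^*\}$, $\mathcal{A}^*v\mathcal{A}^* := \{xvy;\ x, y \in \mathcal{A}^*\}$, etc.
Thus $w \in \mathcal{A}^*v\mathcal{A}^* \Leftrightarrow v \in w$; $w \in v\mathcal{A}^* \Leftrightarrow v \triangleleft w$, and so on.

When $\xi = w_1 \cdots w_m \cdots \in \mathcal{A}^* \cup \mathcal{A}^{\mathbb{N}}$ ($w_i \in \mathcal{A}$), we also write $\xi|_i = w_i$, ...,
and $\xi[i,j] = w_i w_{i+1} \cdots w_j$ ($i \le j$).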

As already defined, a substitution $\sigma$ over $\mathcal{A}$ is a morphism of $\mathcal{A}^*$. The matrix
$M = (L(\sigma(a)), L(\sigma(b)))$ is called the incidence matrix of $\sigma$. The characteristic
polynomial $\lambda^2 - \mathrm{tr}(M)\lambda + \det(M)$ of $M$ is also called the characteristic polynomial
of $\sigma$.
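The incidence matrix and the coefficients of the characteristic polynomial can be read off directly from the images of the two letters. The sketch below is only an illustration with hypothetical function names; it follows the column convention $M = (L(\sigma(a)), L(\sigma(b)))$ stated above and uses the substitution $a \mapsto aab$, $b \mapsto ba$ from the paper's final example.

```python
def parikh(word):
    """Parikh vector (|w|_a, |w|_b) of a word over {a, b}."""
    return (word.count("a"), word.count("b"))


def incidence_matrix(sigma):
    """2x2 matrix whose columns are L(sigma(a)) and L(sigma(b))."""
    col_a = parikh(sigma["a"])
    col_b = parikh(sigma["b"])
    return [[col_a[0], col_b[0]],
            [col_a[1], col_b[1]]]


def trace_and_det(M):
    """Coefficients of the characteristic polynomial x^2 - tr(M) x + det(M)."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace, det


sigma = {"a": "aab", "b": "ba"}
M = incidence_matrix(sigma)
trace, det = trace_and_det(M)
print(M, trace, det)
```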

If $\sigma(a)$ and $\sigma(b)$ have distinct first letters, we say that the substitution $\sigma$ is
marked, and if moreover $\sigma(a) = a\circledast$ and $\sigma(b) = b\circledast$, we say that $\sigma$ is well-marked.
It is easy to see that $\sigma^2$ is well-marked if $\sigma$ is marked.

In this paper, all substitutions are assumed to be non-erasing, that is, the
image of each letter is not empty. Whence, the substitution can be extended
naturally to $\mathcal{A}^{\mathbb{N}}$. An infinite word $\xi = \xi_1\xi_2\cdots$ is a fixed point of $\sigma$ if $\sigma(\xi) = \xi$.

Hereafter, we suppose that the substitution $\sigma$ is primitive (i.e. its incidence
matrix $M$ is primitive: $M^n$ has only positive entries for some positive integer
$n$). The following easy facts for a primitive substitution $\sigma$ are well known:

(1) the fixed point of $\sigma$ is recurrent, that is, every factor occurs infinitely
many times; and all the fixed points of $\sigma$ have the same language;

(2) a substitution $\sigma$ and its powers $\sigma^n$ ($n \ge 1$) have the same fixed points,
and thus have the same language;

(3) if one substitution is a composition of an inner automorphism (of the free
group) with another substitution, then the two substitutions have the
same language.
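For a two-letter alphabet the primitivity condition on $M$ is easy to test by machine. The sketch below is an illustration with hypothetical names; it relies on Wielandt's bound, which for a $2 \times 2$ non-negative matrix says that $M$ is primitive exactly when $M$ or $M^2$ has only positive entries.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_primitive_2x2(M):
    """Check whether M or M^2 is entrywise positive (Wielandt's bound for size 2)."""
    power = M
    for _ in range(2):
        if all(entry > 0 for row in power for entry in row):
            return True
        power = mat_mul(power, M)
    return False


print(is_primitive_2x2([[2, 1], [1, 1]]))  # incidence matrix of a -> aab, b -> ba
print(is_primitive_2x2([[0, 1], [1, 0]]))  # letter exchange a -> b, b -> a
```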

We suppose also that the fixed point $\xi$ of $\sigma$ is not (ultimately) periodic; the
periodic case is characterized completely by Séébold [19]. In particular,
$\{\sigma(a), \sigma(b)\}$ is a code, and thus $\sigma$ is marked up to an inner automorphism (see
0). For the sake of calculating the complexity of a non-periodic primitive
substitution, we may further suppose, without loss of generality, that the
substitution is well-marked.

The notion of “special words” is a powerful tool for calculating the complexity. 
See O E] and glElIlQ] for more information. 

Let $W$ be a factor of $\xi$. If $\delta \in \mathcal{A}$ is such that $W\delta$ is a factor of $\xi$, then we say that
$W\delta$ is a right extension of $W$. A word $W$ is called a right special word (special word
for short) of $\xi$ if it has more than one right extension, that is, $Wa \in \xi$ and $Wb \in \xi$.
Similarly we define "left extension" and "left special word". It is easy to see that
a suffix (resp. prefix) of a special (resp. left special) word is also special (resp.
left special).

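Let $\mathcal{S}_n$ (resp. $\mathcal{LS}_n$) be the set of special words (resp. left special words) of
length $n$ of $\xi$. Put $\mathcal{S} = \bigcup_{n \ge 0} \mathcal{S}_n$ (resp. $\mathcal{LS} = \bigcup_{n \ge 0} \mathcal{LS}_n$). It is easy to see that

$s(n) := \#\mathcal{S}_n = \#\mathcal{LS}_n = \Delta p(n+1)\ (:= p(n+1) - p(n))$.

Hence the study of $p(n)$ is almost equivalent to the study of $s(n)$.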
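The identity $s(n) = p(n+1) - p(n)$ can be checked numerically on a long prefix of a fixed point. The following sketch uses the paper's example substitution $a \mapsto aab$, $b \mapsto ba$; the helper names are hypothetical, and the computation assumes that the chosen prefix already contains all factors of the lengths involved.

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def factors(word, n):
    """Set of factors of length n occurring in the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def right_special(word, n):
    """Factors W of length n such that both Wa and Wb occur."""
    longer = factors(word, n + 1)
    return {w for w in factors(word, n) if w + "a" in longer and w + "b" in longer}


def left_special(word, n):
    """Factors W of length n such that both aW and bW occur."""
    longer = factors(word, n + 1)
    return {w for w in factors(word, n) if "a" + w in longer and "b" + w in longer}


sigma = {"a": "aab", "b": "ba"}
prefix = "a"
for _ in range(9):                 # assumption: this prefix is long enough for n <= 6
    prefix = apply(sigma, prefix)

p = [len(factors(prefix, n)) for n in range(8)]
s_right = [len(right_special(prefix, n)) for n in range(7)]
s_left = [len(left_special(prefix, n)) for n in range(7)]
print(s_right, s_left)
```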
2.1. The word $W_0$ and the letters $\delta_a$, $\delta_b$.

Write $A = \sigma(a)$, $B = \sigma(b)$, and denote by $\{A,B\}^*$ the set of words obtained by
a finite concatenation of the words $A$ and $B$. Put, as before, e.g. $\{A,B\}^*A :=
\{VA;\ V \in \{A,B\}^*\}$. Remark that since $\sigma$ is non-periodic, $\{A,B\}$ is a code and
$\{A,B\}^*$ is a disjoint union of $\{A,B\}^*A$ and $\{A,B\}^*B$.

Since $\sigma$ is non-periodic, the left-infinite words $A^\infty (= \cdots AA \cdots A)$ and $B^\infty$ are
different. Let $W_0$ be the longest common suffix of $A^\infty$ and $B^\infty$ (see also [20]).
Remark that $W_0$ is possibly empty.

The following lemma is a direct consequence of the Fine-Wilf theorem [16].
Lemma 2.1. $|W_0| \le |A| + |B| - 2$.

By the definition of $W_0$, for some $\delta_a, \delta_b \in \{a, b\}$ with $\{\delta_a, \delta_b\} = \{a, b\}$,

(2.1)  $A^\infty = \circledast\delta_a W_0$ and $B^\infty = \circledast\delta_b W_0$.
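The word $W_0$ and the letters $\delta_a$, $\delta_b$ can be computed by comparing sufficiently long suffixes of $A^\infty$ and $B^\infty$; by Lemma 2.1, suffixes of length $|A| + |B|$ are enough. The sketch below is an illustration with hypothetical names. Its second example, $a \mapsto aba$, $b \mapsto ba$, is not taken from the paper and is used only because it has a nonempty $W_0$.

```python
def w0_and_deltas(A, B):
    """Longest common suffix W0 of A^inf and B^inf, with the letters preceding it.

    Returns (W0, delta_a, delta_b). Raises ValueError if the two left-infinite
    words agree on the whole window, which Lemma 2.1 rules out when they differ.
    """
    window = len(A) + len(B)
    tail_a = (A * (window // len(A) + 1))[-window:]
    tail_b = (B * (window // len(B) + 1))[-window:]
    k = 0
    while k < window and tail_a[-1 - k] == tail_b[-1 - k]:
        k += 1
    if k == window:
        raise ValueError("A^inf and B^inf coincide on the window (periodic case)")
    return tail_a[window - k:], tail_a[-1 - k], tail_b[-1 - k]


print(w0_and_deltas("aab", "ba"))   # the paper's example: W0 is empty
print(w0_and_deltas("aba", "ba"))   # hypothetical example with nonempty W0
```

In the second example $|W_0| = 3 = |A| + |B| - 2$, so the bound of Lemma 2.1 is attained.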

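Formula (2.1) shows that there exist $m \ge 0$ and $A' \triangleright A$ ($|A'| < |A|$) such that

(2.2)  $W_0 = A'A^m$, and $\delta_a A' \triangleright A$,

and similarly, for some $m' \ge 0$ and $B' \triangleright B$ ($|B'| < |B|$),

$W_0 = B'B^{m'}$ and $\delta_b B' \triangleright B$.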

The following lemma is essentially due to [20].

Lemma 2.2. (1) For $W \in \{A,B\}^*$, we have $W_0 \bowtie W$. Furthermore,

(2) If $W \in \{A,B\}^*A$ (resp. $\{A,B\}^*B$) and $|W| > |W_0|$, then $\delta_a W_0 \triangleright W$ (resp.
$\delta_b W_0 \triangleright W$), where $\delta_a$ and $\delta_b$ are defined in (2.1).

(3) Let $W \in \{A,B\}^*$. If $\delta_a W_0 \triangleright W$ (resp. $\delta_b W_0 \triangleright W$), then $W \in \{A,B\}^*A$
(resp. $\{A,B\}^*B$).

In brief, any word in $\{A,B\}^*$ is comparable with $W_0$. Amongst them, the words
in $\{A,B\}^*A$ are comparable with $\delta_a W_0$ and the words in $\{A,B\}^*B$ are comparable with $\delta_b W_0$.

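Proof. If $W = A$ or $W = B$, the lemma is obvious. Suppose $W \in \{A,B\}^*$ is such
that $W_0 \bowtie W$; we claim that $\delta_a W_0 \bowtie WA$ and $\delta_b W_0 \bowtie WB$. The two statements
can be proven in the same way, and we only show the first one by considering
the following two cases:

Case 1: $W_0 \triangleright W$. Then $W_0A \triangleright WA$, and on the other hand, $\delta_a W_0 \triangleright W_0A$ because
both of them are suffixes of $A^\infty$. Hence $\delta_a W_0 \triangleright WA$.

Case 2: $W \triangleright W_0$. Then $WA \triangleright W_0A$, while $W_0A$ is a suffix of $A^\infty$, and thus $WA$ is
a suffix of $A^\infty$. This yields that $WA \bowtie \delta_a W_0$ because both of them are suffixes
of $A^\infty$. □

Corollary 2.1. Let $W \in \{A,B\}^*$. Then $W_0 \triangleright W_0W$, $\delta_a W_0 \triangleright W_0WA$, $\delta_b W_0 \triangleright W_0WB$.
In particular, $\delta_a W_0 \triangleright W_0A$, $\delta_b W_0 \triangleright W_0B$.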
2.2. Natural decomposition and identifiability.

Let $\xi$ be a fixed sequence of $\sigma$. Write $\xi = \xi_1\xi_2\cdots$. Since $\sigma(\xi) = \xi$, we have the
following so-called "natural decomposition" of $\xi$:

(2.3)  $\xi = [\xi_1\xi_2\cdots\xi_{n_2-1}]\,[\xi_{n_2}\cdots\xi_{n_3-1}]\cdots[\xi_{n_k}\cdots\xi_{n_{k+1}-1}]\cdots$,

where $\xi_k \in \mathcal{A} = \{a, b\}$, $\sigma(\xi_k) = \xi_{n_k}\cdots\xi_{n_{k+1}-1} \in \{A, B\}$ ($k \ge 1$), and
$n_1 (:= 1), \dots, n_k (:= |\sigma(\xi[1,k-1])| + 1), \dots$ are called the "cutting positions" of $\xi$. We
denote

(2.4)  $E_1 = \{n_k;\ k \ge 1\}$.
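The cutting positions follow directly from the formula $n_k = |\sigma(\xi[1,k-1])| + 1$. The sketch below, with hypothetical function names, computes the first few of them for the paper's example substitution $a \mapsto aab$, $b \mapsto ba$ and checks that the blocks between consecutive cutting positions are the images $\sigma(\xi_k)$.

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def cutting_positions(sigma, word):
    """1-based cutting positions n_1, ..., n_{k+1} determined by the letters of word."""
    positions = [1]
    for letter in word:
        positions.append(positions[-1] + len(sigma[letter]))
    return positions


sigma = {"a": "aab", "b": "ba"}
xi = "a"
for _ in range(5):
    xi = apply(sigma, xi)           # finite prefix of the fixed point

cuts = cutting_positions(sigma, xi[:4])
# Block k of the natural decomposition, read off between consecutive cutting positions.
blocks = [xi[cuts[k] - 1:cuts[k + 1] - 1] for k in range(4)]
print(cuts, blocks)
```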

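Now consider the factors of $\xi$. Let $W = \xi_i\xi_{i+1}\cdots\xi_j \in \xi$; then (comparing to
(2.3)) for some integers $k, l$ ($n_{k-1} < i \le n_k \le n_l \le n_{l+1} - 1 \le j < n_{l+2} - 1$), we have

$W = \xi_i\cdots\xi_{n_k-1}\,[\xi_{n_k}\cdots\xi_{n_{k+1}-1}]\cdots[\xi_{n_l}\cdots\xi_{n_{l+1}-1}]\,\xi_{n_{l+1}}\cdots\xi_j$,

that is, observing the cutting positions of $W$ in $\xi$, we can write out the following
natural decomposition of $W$:

(2.5)  $W = U\sigma(\xi_k)\cdots\sigma(\xi_l)V = U\sigma(W')V$,
where

$U = \xi_i\cdots\xi_{n_k-1} \triangleright \sigma(\xi_{k-1})$, $|U| < |\sigma(\xi_{k-1})|$,

$V = \xi_{n_{l+1}}\cdots\xi_j \triangleleft \sigma(\xi_{l+1})$, $|V| < |\sigma(\xi_{l+1})|$,

$W' = \xi_k\cdots\xi_l \in \xi$.

We say that $W'$ (resp. $\xi_m$, $k \le m \le l$) is the ancestor of $\sigma(W')$ (resp. $\sigma(\xi_m)$).
Sometimes, we also call $\xi_{k-1}\xi_k\cdots\xi_l\xi_{l+1}$ the ancestor of $W$.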

We extend the meaning of "natural decomposition" a little: if $W =
U\sigma(W_1W''W_2)V$ as in (2.5), we shall also say that $W = U'\sigma(W'')V'$ is a "natural
decomposition" (where $U' = U\sigma(W_1)$, $V' = \sigma(W_2)V$), and we write $W =
U'[\sigma(W'')]V'$. Equivalently, the notation $U'[\sigma(W'')]V'$ means that there exist
$U'', V'' \in \mathcal{A}^*$ such that

(2.6)  $U''W''V'' \in \xi$, $U' \triangleright \sigma(U'')$, and $V' \triangleleft \sigma(V'')$.

Intuitively, $U'[\sigma(W'')]V'$ appears in $\xi$ with "[" and "]" showing the natural
cutting positions of interest.

We call the decomposition as in (2.5) a strict natural decomposition of $W$.
Remark that any natural decomposition can be extended to a strict one, and, in
general, the natural decompositions of a factor are not unique; and that the fact
$U\sigma(W)V \in \xi$ does not always mean $U[\sigma(W)]V$!
From the theory of identifiability we have (recall that $\xi[i,j] = \xi_i\cdots\xi_j$):

Lemma 2.3. [H] There exists an integer $C$ (depending on $\sigma$) such that, if $W \in \xi$
can be written as $W = \xi[i-C, i+C] = \xi[j-C, j+C]$ with $i \in E_1$, then we have
$j \in E_1$.

We shall say that $\xi[i-C, i+C]$ and $\xi[j-C, j+C]$ have a relative common
cutting position (at the positions $i$ and $j$ respectively). As a consequence, if $W$
is long enough, say $|W| > L$ with

(2.7)  $L = \max\{2C + \max\{|A|, |B|\},\ |A| + |B| - 1\}\ (> |W_0|)$,

and it appears at different positions in $\xi$, $W = \xi[i_1, i_2] = \xi[j_1, j_2]$, then, roughly
speaking, at the middle position of $\xi[i_1, i_2]$ and $\xi[j_1, j_2]$ they have a relative
common cutting position: for some integer $N \in (|W|/2 - \max\{|A|, |B|\},\ |W|/2 +
\max\{|A|, |B|\})$, $i_1 + N \in E_1$ and $j_1 + N \in E_1$.

3. The Operator $T$ and Structure of $\mathcal{LS}$

Define $T : \mathcal{A}^* \to \mathcal{A}^*$:

$T(W) = W_0\sigma(W)$.

Notice that $T$ is not a morphism on $\mathcal{A}^*$. It is readily checked that $T$ is injective
and

(3.8)  $T^n(W) = W_0\sigma(W_0)\cdots\sigma^{n-1}(W_0)\sigma^n(W)$.
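Formula (3.8) is a plain string identity and can be checked mechanically. The sketch below is an illustration with hypothetical names; it uses the substitution $a \mapsto aba$, $b \mapsto ba$ with $W_0 = aba$, which is not from the paper and is chosen only so that $W_0$ is nonempty.

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def T(sigma, w0, word):
    """The operator T(W) = W0 sigma(W)."""
    return w0 + apply(sigma, word)


def T_power(sigma, w0, word, n):
    """Iterate T n times."""
    for _ in range(n):
        word = T(sigma, w0, word)
    return word


def T_power_closed_form(sigma, w0, word, n):
    """Right-hand side of (3.8): W0 sigma(W0) ... sigma^{n-1}(W0) sigma^n(W)."""
    parts, block = [], w0
    for _ in range(n):
        parts.append(block)
        block = apply(sigma, block)
        word = apply(sigma, word)
    return "".join(parts) + word


sigma = {"a": "aba", "b": "ba"}     # hypothetical well-marked substitution
w0 = "aba"                          # its longest common suffix of A^inf and B^inf
print(T_power(sigma, w0, "ab", 2) == T_power_closed_form(sigma, w0, "ab", 2))
```

Since $T(uv)$ contains $W_0$ once while $T(u)T(v)$ contains it twice, the same code also makes visible why $T$ is not a morphism when $W_0$ is nonempty.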

Lemma 3.1. If $W \in \xi$, then $T(W) \in \xi$. Moreover, $T(W) = W_0[\sigma(W)]$.

Proof. Due to the primitivity of $\sigma$, the fixed sequence $\xi$ is recurrent. Thus for
any $n \in \mathbb{N}$, $UW \in \xi$ for some $U \in \mathcal{A}^*$ with $|U| = n$. Now by the $\sigma$-invariance of
$\xi$, we have that $\sigma(U)\sigma(W) \in \xi$. When the length $n$ of $U$ is large, $W_0 \triangleright \sigma(U)$ by
Lemma 2.2, therefore $T(W) = W_0[\sigma(W)] \in \xi$. □

Lemma 3.2. Let $W_1, W_2 \in \mathcal{A}^*$. Then $T(W_1) = T(W_2)$ if and only if $W_1 = W_2$;
$T(W_1) \triangleleft T(W_2)$ if and only if $W_1 \triangleleft W_2$; $T(W_1) \triangleright T(W_2)$ if and only if $W_1 \triangleright W_2$.

Proof. The first two easy statements hold since $\sigma$ is well-marked, and the last
one follows from Corollary 2.1. □

The following lemma tells us that if a factor $W$ appears at two positions with
different natural decompositions, then, up to a prefix $W_0' \triangleright W_0$, they have the
same relative cutting positions.

Lemma 3.3. Suppose that $W \in \xi$, $|W| > L$ (with $L$ defined in (2.7)), and that
$W$ appears at two different positions in $\xi$, with $W = P_1[\sigma(U_1)]Q_1$ and $W =
P_2[\sigma(U_2)]Q_2$ the corresponding strict natural decompositions. Then, denoting by
$U$ the longest common suffix of $U_1$ and $U_2$ and thus writing $U_1 = U_1'U$, $U_2 = U_2'U$
(where $U_1'$ or $U_2'$ is possibly empty), we have that $U$ is nonempty and

(3.9)  $P_1\sigma(U_1)Q_1 = W_0'[\sigma(U)]Q = P_2\sigma(U_2)Q_2$,

where $Q = Q_1 = Q_2$, $W_0' = P_1\sigma(U_1') = P_2\sigma(U_2') \bowtie W_0$. More precisely, either
$W_0' \triangleright W_0$, or $U_1' = U_2' = \varepsilon$ and $W_0' \triangleright \sigma(\delta)$ for some $\delta \in \mathcal{A}$.

Proof. By Lemma 2.3, the two strict natural decompositions share a relative
cutting position, and thus all the cutting positions after this one. This implies
that $U_1$ and $U_2$ have a nonempty common suffix, i.e., $U$ is not empty. This also
implies that $Q_1 = Q_2$, and consequently that $P_1\sigma(U_1') = P_2\sigma(U_2') \bowtie W_0$, where
the last formula is due to Lemma 2.2. □


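Lemma 3.4. (1) If $W \in \mathcal{LS}$ with $|W| > L$, then there exist unique $U \in \mathcal{A}^*$, $\delta \in
\mathcal{A}$ and $Q \triangleleft \sigma(\delta)$ with $U\delta \in \xi$ and $|Q| < |\sigma(\delta)|$, such that

$aW = aW_0[\sigma(U)]Q$ and $bW = bW_0[\sigma(U)]Q$.

(2) If $W \in \mathcal{S}$ with $|W| > L$, then there exist $U \in \mathcal{A}^*$, $W_0' \in \mathcal{A}^*$ with either
$W_0' \triangleright W_0$, or $W_0' \triangleright \sigma(\delta)$ and $|W_0'| < |\sigma(\delta)|$ for some $\delta \in \mathcal{A}$, such that

$Wa = W_0'[\sigma(U)]a$ and $Wb = W_0'[\sigma(U)]b$.

(3) If $W \in \mathcal{LS} \cap \mathcal{S}$ with $|W| > L$, then there exists a unique $U \in \mathcal{A}^*$ such
that $W = T(U)$.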


Remark: A word $W$ in $\mathcal{LS} \cap \mathcal{S}$ is called a bispecial word; this notion is developed
in [S], see also [1].

Proof. (1) Consider the strict natural decompositions of $aW$ and $bW$:

$aW = aP_a[\sigma(U_a)]Q_a$ and $bW = bP_b[\sigma(U_b)]Q_b$,

with $U$ the longest common suffix of $U_a$ and $U_b$, $U_a = U_a'U$, $U_b = U_b'U$. Then, as
in the previous proof, $U$ is nonempty, $Q_a = Q_b$, $P_a\sigma(U_a') = P_b\sigma(U_b')$. Moreover,
putting $W_0' = P_a\sigma(U_a')$, we have that $aW_0' \triangleright \sigma(W_a)$ and $bW_0' \triangleright \sigma(W_b)$ with $W_a, W_b \in
\mathcal{A}^*$, and the last letters of $W_a$ and $W_b$ are distinct. Together with Lemma 2.2,
these facts imply that $W_0' = W_0$.

(2) The proof for this part is similar to the first part.

(3) This is a corollary of the first two parts. □


Lemma 3.5. (1) $W_0 \in \mathcal{LS}$;

(2) Any prefix of a left special word is left special;

(3) If $W \in \mathcal{LS}$, then $T(W) \in \mathcal{LS}$.

(4) Let $W \in \mathcal{LS}$ with $|W| > L$; then there exist unique $U \in \xi$, $\delta \in \{a, b\}$
such that $W = W_0[\sigma(U)]Q = T(U)Q \triangleleft T(W')$ (see Lemma 3.4), where $W' = U\delta$.
Furthermore, $U, W' \in \mathcal{LS}$.

Proof. (1) and (2) are obvious.

(3) If $aW \in \xi$, then $T(aW) \in \xi$ by Lemma 3.1. By Corollary 2.1, $\delta_aT(W) =
\delta_aW_0\sigma(W)$ is a suffix of $T(aW) = W_0A\sigma(W)$, and thus $\delta_aT(W) \in \xi$. From this,
we see that $W \in \mathcal{LS}$ implies $T(W) \in \mathcal{LS}$.

(4) It follows from the proof of the preceding lemma. □
Now let

$\overline{\mathcal{LS}} = \bigcup_{i=1}^{L} \mathcal{LS}_i$,  $\mathcal{LS}^{(n)} = \{W;\ W \triangleleft T^n(W'),\ W' \in \overline{\mathcal{LS}}\}$.

Remark that $\mathcal{LS}^{(n)}$ is monotone with respect to $n$. The following theorem follows
directly from the above lemma:

Theorem 3.1. $\mathcal{LS} = \bigcup_{n \ge 0} \mathcal{LS}^{(n)} = \lim_{n \to \infty} \mathcal{LS}^{(n)}$.

Remark: The above theorem tells us that all left special words (which determine
the complexity) can be obtained from a finite set $\overline{\mathcal{LS}}$ of left special words and by
the operation $T$.

4. Structure of $\mathcal{S}$ and Calculation of $\Delta^2 p(n)$

Knowing the initial values, calculating $p(n)$ boils down to calculating $\Delta s(n+1) =
\Delta^2 p(n+2)$. Notice that any suffix of a special word is also special, hence
if $W \in \mathcal{S}_{n+1}$ then $W = \delta W'$ for some $W' \in \mathcal{S}_n$ and $\delta \in \{a, b\}$. Thus the set of
special words can be visualized as a tree showing clearly how $\mathcal{S}_{n+1}$ derives from
$\mathcal{S}_n$ (see the example and the figure in the last section).

As usual, for studying the tree of special words, we shall use the following
notations for special words, see also [B]:

Definition 4.1. Let $W \in \mathcal{S}$. If neither $aW$ nor $bW$ is in $\mathcal{S}$, we say that $W$ is a
weak special word; if both $aW$ and $bW$ are in $\mathcal{S}$, we say that $W$ is a strong special
word. We denote by $\mathcal{S}^0$ and $\mathcal{S}^2$ the set of weak special words and the set of strong
special words respectively. The collection of the other special words is denoted by $\mathcal{S}^1$.

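For $i \in \{0, 1, 2\}$, we write $\mathcal{S}^i_n = \mathcal{S}^i \cap \mathcal{S}_n$. It is clear that

$\mathcal{S}_n = \mathcal{S}^0_n \cup \mathcal{S}^1_n \cup \mathcal{S}^2_n$ and $\mathcal{S} = \mathcal{S}^0 \cup \mathcal{S}^1 \cup \mathcal{S}^2$.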
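The classification of Definition 4.1 can be carried out by brute force on a long prefix of the fixed point. The sketch below, with hypothetical names, sorts the special words of a given length into weak, neutral and strong ones for the paper's example substitution $a \mapsto aab$, $b \mapsto ba$, and compares $\#\mathcal{S}^2_n - \#\mathcal{S}^0_n$ with $s(n+1) - s(n)$. It assumes the prefix is long enough to contain every factor of the lengths used.

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def factors(word, n):
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def right_special(word, n):
    longer = factors(word, n + 1)
    return {w for w in factors(word, n) if w + "a" in longer and w + "b" in longer}


def classify(word, n):
    """Split the special words of length n into (weak, neutral, strong) sets."""
    next_level = right_special(word, n + 1)
    weak, neutral, strong = set(), set(), set()
    for w in right_special(word, n):
        count = ("a" + w in next_level) + ("b" + w in next_level)
        (weak, neutral, strong)[count].add(w)
    return weak, neutral, strong


sigma = {"a": "aab", "b": "ba"}
prefix = "a"
for _ in range(9):                  # assumption: long enough for lengths up to 10
    prefix = apply(sigma, prefix)

s = [len(right_special(prefix, n)) for n in range(10)]
jumps = []
for n in range(9):
    weak, _, strong = classify(prefix, n)
    jumps.append(len(strong) - len(weak))
print(s, jumps)
```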

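Lemma 4.1. (1) $\Delta s(n+1) = s(n+1) - s(n) = \#\mathcal{S}^2_n - \#\mathcal{S}^0_n$.
(2) $\mathcal{S}^0_n \cup \mathcal{S}^2_n \subset \mathcal{S}_n \cap \mathcal{LS}_n$.

Proof. (see Theorem 4.5.4 [6]) (1) and the fact that $\mathcal{S}^2_n \subset \mathcal{LS}_n$ are obvious. If a
special word has only one left extension, then this left extension is also special. □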

Lemma 4.2. Let $c, d \in \mathcal{A}$, $W \in \xi$. If $cWd \in \xi$, then $\delta_cT(W)d \in \xi$. Conversely,
if $\delta_cT(W)d \in \xi$ and $|T(W)| > L$, then $cWd \in \xi$.

Proof. If $cWd \in \xi$, then by Lemma 3.1, $T(cWd) \in \xi$, i.e., $W_0\sigma(c)\sigma(W)\sigma(d) \in \xi$.
This together with Corollary 2.1 and the fact that $\sigma$ is well-marked implies that
$\delta_cW_0\sigma(W)d = \delta_cT(W)d \in \xi$.

Conversely, if $\delta_cT(W)d \in \xi$ and $|T(W)| > L$, then by Lemma 3.3 we know that
$\delta_cT(W)d = \delta_cW_0[\sigma(W)]d$ is a natural decomposition. Considering the ancestor
of $\delta_cT(W)d$, we know, again by Corollary 2.1 and the fact that $\sigma$ is well-marked,
that $cWd \in \xi$. □
Lemma 4.3. If $W \in \mathcal{S}$, then $T(W) \in \mathcal{S}$ (thus $\sigma(W) \in \mathcal{S}$); furthermore $T(W)a =
W_0[\sigma(W)]a$, and $T(W)b = W_0[\sigma(W)]b$.

Conversely, if $W \in \mathcal{S}$ and $|W| > L$, then there exists $U \in \mathcal{S}$ such that $W \triangleright T(U)$.

Proof. Let $W \in \mathcal{S}$; then $Wa, Wb \in \xi$, and by Lemma 3.1,

$W_0[\sigma(W)]A,\ W_0[\sigma(W)]B \in \xi$.

Recalling that $A = a\circledast$ and $B = b\circledast$, the first part of our lemma is thus proved.

The remaining part is a restatement of Lemma 3.4 (2). □

We can say more on the structure of $\mathcal{S}^2$ and $\mathcal{S}^0$.

Lemma 4.4. If $W \in \mathcal{S}^2$, then $T(W) \in \mathcal{S}^2$. Conversely, if $W \in \mathcal{S}^2$ and $|W| > L$,
then there exists a unique $U \in \mathcal{S}^2$ such that $W = T(U)$.

Proof. Let $W \in \mathcal{S}^2$. Then we have, by definition, that

(4.10)  $aWa,\ aWb,\ bWa,\ bWb \in \xi$,
and, by Lemma 4.2, that

$\delta_aT(W)a,\ \delta_aT(W)b,\ \delta_bT(W)a,\ \delta_bT(W)b \in \xi$,

i.e., $T(W) \in \mathcal{S}^2$. The first part of the lemma is proved.

Now suppose $W \in \mathcal{S}^2$ and $|W| > L$. Then by Lemmas 4.1 (2) and 3.4 (3),
$W = T(U)$. By Lemma 4.2, $U \in \mathcal{S}^2$. □

Lemma 4.5. If $W \in \mathcal{S}^0$ and $|T(W)| > L$, then $T(W) \in \mathcal{S}^0$. Conversely, if
$W \in \mathcal{S}^0$ and $|W| > L$, then there exists a unique $U \in \mathcal{S}^0$ such that $W = T(U)$.

Proof. By Lemma 4.2, when $|T(W)| > L$ we know that $cWd \in \xi$ if and only if
$\delta_cT(W)d \in \xi$. Whence $W \in \mathcal{S}^0$ if and only if $T(W) \in \mathcal{S}^0$. The remaining proof
is almost the same as the corresponding part of the preceding lemma. □

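Now denote by $\overline{\mathcal{S}^2} = \bigcup_{i=1}^{L} \mathcal{S}^2_i$ the set of strong special words of length less than $L$,
and by $\widetilde{\mathcal{S}^2}$ the set of the words $W \in \overline{\mathcal{S}^2}$ such that $|T(W)| > L$. The sets $\overline{\mathcal{S}^0}$ and $\widetilde{\mathcal{S}^0}$ are
defined in a similar way. Let

(4.11)  $\widetilde{\mathcal{S}} = \widetilde{\mathcal{S}^2} \cup \widetilde{\mathcal{S}^0}$,
which will be considered as "initial special words".

Lemma 4.6. For any $n > L$, we have

$\#\mathcal{S}^2_n = \sum_{W \in \widetilde{\mathcal{S}^2}} \sum_{k \ge 1} \delta(|T^k(W)|, n)$ and $\#\mathcal{S}^0_n = \sum_{W \in \widetilde{\mathcal{S}^0}} \sum_{k \ge 1} \delta(|T^k(W)|, n)$,

where $\delta(i,j)$ is the Kronecker symbol: $\delta(i,j) = 1$ if $i = j$ and $= 0$ otherwise.

Proof. Let $U \in \mathcal{S}^2_n$. By Lemma 4.4 there exist $k \ge 1$ and $W \in \widetilde{\mathcal{S}^2}$, which are
unique, such that $U = T^k(W)$. Conversely, if $|T^k(W)| = n$ for some $k \ge 1$, $W \in
\widetilde{\mathcal{S}^2}$, then $T^k(W) \in \mathcal{S}^2_n$. Thus we have

$\mathcal{S}^2_n = \{U;\ U = T^k(W),\ |T^k(W)| = n,\ k \ge 1,\ W \in \widetilde{\mathcal{S}^2}\}$,

where $k$ and $W$ in the representation $U = T^k(W)$ are uniquely determined by $U$.
The first equality is thus proved. The second is proved similarly. □

The following formula then follows from the above lemma and Lemma 4.1.

Lemma 4.7. For any $n > L$, we have

$\Delta s(n+1) = s(n+1) - s(n) = \sum_{W \in \widetilde{\mathcal{S}^2}} \sum_{k \ge 1} \delta(|T^k(W)|, n) - \sum_{W \in \widetilde{\mathcal{S}^0}} \sum_{k \ge 1} \delta(|T^k(W)|, n)$.

It can be written as

$\Delta s(n+1) = \sum_{W \in \overline{\mathcal{S}}} \sum_{k \ge 1} \mathrm{sgn}(W)\,\delta(|T^k(W)|, n)$,

where $\overline{\mathcal{S}} = \bigcup_{i=1}^{L} \mathcal{S}_i$ (the special words of length less than $L$), and

(4.12)  $\mathrm{sgn}(W) = -1$ if $W \in \widetilde{\mathcal{S}^0}$;  $\mathrm{sgn}(W) = 1$ if $W \in \widetilde{\mathcal{S}^2}$;  $\mathrm{sgn}(W) = 0$ otherwise.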


Remark: 1. The function sgn(·) is equal to the bilateral multiplicity of a factor
(i). See Theorem 4.5.4 [6] for more general cases.

2. The above lemma tells us that the complexity $p(n)$ can be computed knowing
a finite set of special words. In the next section, we will find a (non-linear)
recurrence formula for the computation.

5. Recurrence Formula for the Complexity 


Recall that $M$ denotes the incidence matrix of $\sigma$. Then $M^2$ is the incidence
matrix of $\sigma^2$, which possesses non-negative eigenvalues. Since $\sigma$ and $\sigma^2$
share the fixed sequence $\xi$, we may suppose without loss of generality that
the eigenvalues of $M$ are non-negative.

Let $\lambda_1 \ge \lambda_2 \ge 0$ be the two eigenvalues, and $V_1, V_2$ be the corresponding
eigenvectors. Since $M$ is primitive, $\lambda_1 > \lambda_2$ and $V_1$ is positive.

Recall that for $W \in \{a,b\}^*$, $L(W) = (|W|_a, |W|_b)^t$. We have

(5.13)  $|\sigma^n(W)| = (1,1)M^nL(W)$.
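Formula (5.13) says that lengths of iterated images can be computed from the Parikh vector alone, without building the words. A minimal check, with hypothetical function names and the paper's example substitution $a \mapsto aab$, $b \mapsto ba$:

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def length_via_matrix(M, word, n):
    """Compute (1,1) M^n L(word) by applying M to the Parikh vector n times."""
    v = [word.count("a"), word.count("b")]
    for _ in range(n):
        v = [M[0][0] * v[0] + M[0][1] * v[1],
             M[1][0] * v[0] + M[1][1] * v[1]]
    return v[0] + v[1]


def length_via_words(sigma, word, n):
    """Compute |sigma^n(word)| by actually iterating the substitution."""
    for _ in range(n):
        word = apply(sigma, word)
    return len(word)


sigma = {"a": "aab", "b": "ba"}
M = [[2, 1], [1, 1]]                 # columns L(sigma(a)) and L(sigma(b))
print([length_via_matrix(M, "abaa", n) for n in range(5)])
```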
Lemma 5.1. Let $X, Y \in \mathbb{R}^2$. Then there exists $N = N(X,Y) \ge 1$ such that
$(1,1)M^{N+n}(X - Y)$ ($n \in \mathbb{N}$) is of constant sign. That is,

$(1,1)M^{N+n}X >$ (resp. $=$, $<$) $(1,1)M^{N+n}Y$ for all $n \in \mathbb{N}$.

Proof. Let $X - Y = \mu_1V_1 + \mu_2V_2$ where $\mu_1, \mu_2 \in \mathbb{R}$; then for $k \ge 1$,

$(1,1)M^k(X - Y) = \lambda_1^k\mu_1(1,1)V_1 + \lambda_2^k\mu_2(1,1)V_2$.

Case 1. $\mu_1 = 0$. Then $(1,1)M^k(X - Y) = \lambda_2^k\mu_2(1,1)V_2$, which is obviously of the
sign of $\lambda_2\mu_2(1,1)V_2$, independent of $k \ge 1$.

Case 2. $\mu_1 > 0$. Since $\lambda_1 > 0$, $(1,1)V_1 > 0$ and $\lambda_1 > \lambda_2 \ge 0$, there exists $N \ge 1$
such that for $k \ge N$ we have $\lambda_1^k\mu_1(1,1)V_1 + \lambda_2^k\mu_2(1,1)V_2 > 0$.

Case 3. $\mu_1 < 0$. The proof is similar to that of Case 2. □

Corollary 5.1. Let $W_1, W_2 \in \mathcal{A}^*$. There exists $N = N(W_1, W_2)$ such that
$|T^{N+n}(W_1)| - |T^{N+n}(W_2)|$ ($n \in \mathbb{N}$) is of constant sign. This sign (called the
final sign) will be denoted by $\mathrm{SGN}(W_1, W_2)$.

Proof. The corollary follows directly from the above lemma and (5.13). □

In fact, we can say more: 

Corollary 5.2. Let $W_1, W_2 \in \mathcal{A}^*$. Then there exist $m_1 = m_1(W_1, W_2)$, $m_2 =
m_2(W_1, W_2) \in \mathbb{N}$ such that one of the following alternatives holds:

(1) $|T^{m_1}(W_1)| = |T^{m_2}(W_2)| < |T^{m_1+1}(W_1)| = |T^{m_2+1}(W_2)| < |T^{m_1+2}(W_1)| =
|T^{m_2+2}(W_2)| < \cdots$

(2) $|T^{m_1}(W_1)| < |T^{m_2}(W_2)| < |T^{m_1+1}(W_1)| < |T^{m_2+1}(W_2)| < |T^{m_1+2}(W_1)| <
|T^{m_2+2}(W_2)| < \cdots$

Proof. If $\mathrm{SGN}(T^m(W_1), T^n(W_2)) = 0$ for some $m, n \in \mathbb{N}$, the alternative (1)
holds.

Otherwise, $\mathrm{SGN}(T^m(W_1), T^n(W_2)) \ne 0$ for any $m, n \in \mathbb{N}$. We assume, without
loss of generality, that $\mathrm{SGN}(W_1, W_2) = -1$. Due to the primitivity, $W_2$ is a
factor of $T^l(W_1)$ for $l$ large enough, and it turns out that $\mathrm{SGN}(T^l(W_1), W_2) = 1$.
Now clearly $m \mapsto \mathrm{SGN}(T^m(W_1), W_2)$ is an increasing mapping from $\mathbb{N}$ onto
$\{-1, 1\}$, therefore there exists $m \in \mathbb{N}$ such that $\mathrm{SGN}(T^m(W_1), W_2) = -1$, while
$\mathrm{SGN}(T^{m+1}(W_1), W_2) = 1$. Whence the alternative (2) holds for

$m_2 = \max\{N(T^m(W_1), W_2),\ N(T^{m+1}(W_1), W_2)\}$, and $m_1 = m + m_2$. □
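The interleaving in alternative (2) can be observed on the paper's example substitution $a \mapsto aab$, $b \mapsto ba$, for which $W_0$ is empty and $T = \sigma$. The sketch below, with hypothetical names, lists the lengths $|T^n(W)|$ for $W_1 = a$ and $W_2 = abaa$ and checks that they alternate strictly; the offsets used ($m_1 = 1$, $m_2 = 0$) are found by inspection here, not by the construction in the proof.

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def t_lengths(sigma, w0, word, count):
    """Lengths |T^n(word)| for n = 0, ..., count - 1, where T(W) = w0 + sigma(W)."""
    lengths = []
    for _ in range(count):
        lengths.append(len(word))
        word = w0 + apply(sigma, word)
    return lengths


def interleaves(first, second):
    """True if first[0] < second[0] < first[1] < second[1] < ... holds strictly."""
    merged = [x for pair in zip(first, second) for x in pair]
    return all(x < y for x, y in zip(merged, merged[1:]))


sigma = {"a": "aab", "b": "ba"}
lengths_1 = t_lengths(sigma, "", "a", 7)      # W_1 = a
lengths_2 = t_lengths(sigma, "", "abaa", 6)   # W_2 = abaa
print(lengths_1, lengths_2, interleaves(lengths_1[1:], lengths_2))
```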

Now we can deduce from the above corollary the recurrence properties of the
complexity. First let $\{S_1, S_2, \dots, S_K\}$ be the set of initial special words defined in (4.11), and denote

$n_1 - 1 = \max\{\max\{m_1(W_1, W_2), m_2(W_1, W_2)\};\ W_1, W_2 \in \{S_1, \dots, S_K\}\}$,

where $m_1(W_1, W_2)$, $m_2(W_1, W_2)$ are defined in Corollary 5.2.
We start from $T^{n_1}(S_1)$. By Corollary 5.2, for each $j = 2, 3, \dots, K$, there exists a
unique $n_j \in \mathbb{N}$ such that $|T^{n_1}(S_1)| \le |T^{n_j}(S_j)| < |T^{n_1+1}(S_1)|$. Without loss of
generality we may suppose that

$|T^{n_1}(S_1)| \le |T^{n_2}(S_2)| \le \cdots \le |T^{n_K}(S_K)| < |T^{n_1+1}(S_1)|$.

Then, to simplify the notation, let $N_k^j = |T^j(T^{n_k}(S_k))|$ ($1 \le k \le K$, $j \in \mathbb{N}$).
We have by Corollary 5.2 the following unison property for the "jumps of $|T^k(W)|$":

(5.14)  $N_1^0 \le N_2^0 \le \cdots \le N_K^0 < N_1^1 \le N_2^1 \le \cdots \le N_K^1 < N_1^2 \le \cdots$

Now we can formulate the recurrence formula of the complexity. Let $\chi_{[m,n)}$
denote the indicator function of the integer interval $[m, n)$. Let $I^j = [N_1^j, N_1^{j+1})$, $j \in
\mathbb{N}$. We see that $I^j$ is the disjoint union of the subintervals $I_k^j = [N_k^j, N_{k+1}^j)$ ($j \in
\mathbb{N}$, $k \in \{1, 2, \dots, K\}$), where $N_{K+1}^j = N_1^{j+1}$. That is,

$[N_1^0, \infty) = \bigcup_{j=0}^{\infty} I^j$,  $I^j = \bigcup_{k=1}^{K} I_k^j$.

5.1. Initial values of the complexity.

Finally let $c_k = \sum_{i=1}^{K} \mathrm{sgn}(S_i)\,\delta(|T^{n_i}(S_i)|, |T^{n_k}(S_k)|)$ ($k = 1, \dots, K$), where sgn(·)
is defined in (4.12). Then by Lemma 4.7 we have $\Delta s(n+1) = c_k$ if $n =
|T^{n_k}(S_k)|$ ($k = 1, \dots, K$) and $= 0$ otherwise. In other words, $n \mapsto s(n+1)$ ($n \in I^0$)
is a step function with jumps $c_k$ at $n = N_k^0$ ($k \in \{1, 2, \dots, K\}$):

(5.15)  $s(n+1) = s(N_1^0) + \sum_{k=1}^{K} (c_1 + \cdots + c_k)\chi_{I_k^0}(n)$  ($n \in I^0$).

5.2. Recurrence formula of $s(\cdot + 1)$ on $I^j$.

Notice that $I^j = \bigcup_{k=1}^{K} I_k^j$ ($j \in \mathbb{N}$) can be calculated directly or by some easy
recurrence formula as described in the following:

Proposition 5.1. We have for any $W \in \mathcal{A}^*$, $n \in \mathbb{N}$,

1. $|\sigma^{n+2}(W)| = \mathrm{tr}(M)\,|\sigma^{n+1}(W)| - \det(M)\,|\sigma^n(W)|$;

$|T^{n+2}(W)| = \mathrm{tr}(M)\,|T^{n+1}(W)| - \det(M)\,|T^n(W)| + \alpha$,
where $\alpha = |\sigma(W_0)| - (\mathrm{tr}(M) - 1)|W_0|$.

2. $|\sigma^n(W)| = \lambda_1^n\mu_1(1,1)V_1 + \lambda_2^n\mu_2(1,1)V_2$ if $L(W) = \mu_1V_1 + \mu_2V_2$;

$|T^n(W)| = \lambda_1^n\mu_1(1,1)V_1 + \lambda_2^n\mu_2(1,1)V_2 + h_n$,
where $h_n$ ($n \in \mathbb{N}$) is a fixed sequence given explicitly by $L(W_0)$ and $M$.

Proof. All the results can be deduced easily from (3.8), (5.13) and the Cayley-Hamilton
formula (where $I$ denotes the identity matrix): $M^2 = \mathrm{tr}(M)\,M - \det(M)\,I$. □
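Both recurrences in part 1 of Proposition 5.1, including the constant $\alpha$, can be checked numerically. The sketch below uses hypothetical names and the substitution $a \mapsto aba$, $b \mapsto ba$ with $W_0 = aba$; this substitution is not from the paper and is chosen only because its $W_0$ is nonempty, so that $\alpha \ne 0$.

```python
def apply(sigma, word):
    return "".join(sigma[c] for c in word)


def t_lengths(sigma, w0, word, count):
    """Lengths |T^n(word)| for n < count, with T(W) = w0 + sigma(W); w0 = '' gives sigma."""
    lengths = []
    for _ in range(count):
        lengths.append(len(word))
        word = w0 + apply(sigma, word)
    return lengths


def check_recurrences(sigma, w0, word, steps):
    """Return (alpha, sigma_ok, T_ok) for the two recurrences of Proposition 5.1."""
    a, b = sigma["a"], sigma["b"]
    trace = a.count("a") + b.count("b")
    det = a.count("a") * b.count("b") - b.count("a") * a.count("b")
    alpha = len(apply(sigma, w0)) - (trace - 1) * len(w0)
    s = t_lengths(sigma, "", word, steps + 2)
    t = t_lengths(sigma, w0, word, steps + 2)
    sigma_ok = all(s[n + 2] == trace * s[n + 1] - det * s[n] for n in range(steps))
    T_ok = all(t[n + 2] == trace * t[n + 1] - det * t[n] + alpha for n in range(steps))
    return alpha, sigma_ok, T_ok


sigma = {"a": "aba", "b": "ba"}      # hypothetical substitution with W0 = "aba"
print(check_recurrences(sigma, "aba", "ab", 6))
```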

We have just seen the recurrence properties of the intervals $I^j$ ($j \in \mathbb{N}$). Still
using Lemma 4.7 and the formula (5.14), we see that what happens for $s(n+1)$
($n \in I^j$, $j \in \mathbb{N}$) is recurrently the same as for $s(n+1)$ ($n \in I^0$), i.e., similarly to (5.15)
we have proved the following


Theorem 5.1. Let $\sigma$ be a well-marked, primitive, non-periodic substitution having
non-negative eigenvalues. Then for $n \in [N_1^0, \infty) = \bigcup_{j=0}^{\infty} I^j$, the following
recurrence formula holds:

$s(n+1) = s(N_1^j) + \sum_{k=1}^{K} (c_1 + \cdots + c_k)\chi_{I_k^j}(n)$  ($n \in I^j$, $j \ge 0$).

Remark: 1. The conditions "primitive, well-marked, non-periodic, having
non-negative eigenvalues" are non-essential, as already mentioned.

2. $s(N_1^{j+1}) - s(N_1^j) = c_1 + \cdots + c_K$ ($j \in \mathbb{N}$), which implies roughly
$s(N_1^n) \approx n(c_1 + \cdots + c_K)$ for large $n$.

3. Although the above mentioned $N_1^0$ can be more or less controlled in the
proof of the theorem, how to give this big integer efficiently remains an
open problem.

Finally let us briefly give an example: consider the substitution $\sigma = (aab, ba)$,
i.e., $a \mapsto aab$, $b \mapsto ba$.

For this substitution, we have $W_0 = \varepsilon$ and thus $T = \sigma$. The incidence matrix is

$M = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$

and the characteristic polynomial is $\lambda^2 - 3\lambda + 1$. The fixed point
reads

$\xi = aabaabbaaabaabbabaaabaabaabbaaabaabbabaaabbaaabaab\cdots$

The tree of the special words is depicted in Figure 1.

The weak and strong special words (here $\sigma^0$ is the identity map):
$\mathcal{S}^0 = \{abaa, aabbaaabaab, \cdots\} = \{\sigma^n(abaa);\ n = 0, 1, 2, \cdots\}$,

$\mathcal{S}^2 = \{\varepsilon, a, aab, aabaabba, \cdots\} = \{\varepsilon\} \cup \{\sigma^n(a);\ n = 0, 1, 2, \cdots\}$.

From the structure of special words, the numbers of special words s{n) and the 
complexity p{n) read 
[Figure 1. Tree of Special Words]
n      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
s(n)   1   2   3   3   4   3   3   3   3   4   4   4   3   3   3   3
p(n)   1   2   4   7  10  14  17  20  23  26  30  34  38  41  44  47


We can formulate $s(n)$ as

$s(n) = 1$ if $n = 0$,
$s(n) = 2$ if $n = 1$,
$s(n) = 3$ if $n \in \{2, 3\} \cup \bigcup_{k \ge 0} [d(k) + 1,\ g(k+1)]$,
$s(n) = 4$ if $n \in \bigcup_{k \ge 0} [g(k) + 1,\ d(k)]$,

where the number sequences $g(k)$ and $d(k)$ are defined as

$g(k) = (1,1)M^k(2,1)^t$,  $d(k) = (1,1)M^k(3,1)^t$,

both satisfying the same recurrence:

$g(k+2) = 3g(k+1) - g(k)$,
$d(k+2) = 3d(k+1) - d(k)$,

with $g(0) = 3$, $g(1) = 8$ and $d(0) = 4$, $d(1) = 11$.
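The closed formula for $s(n)$ in this example is easy to evaluate. The sketch below, with hypothetical function names, implements $g$, $d$ and the case distinction, and rebuilds $p(n)$ from $p(0) = 1$ and $p(n+1) = p(n) + s(n)$; the resulting values can be compared with the table of $s(n)$ and $p(n)$ for $0 \le n \le 15$.

```python
def linear_recurrence(x0, x1, k):
    """k-th term of x(k+2) = 3 x(k+1) - x(k) with the given first two terms."""
    for _ in range(k):
        x0, x1 = x1, 3 * x1 - x0
    return x0


def g(k):
    return linear_recurrence(3, 8, k)


def d(k):
    return linear_recurrence(4, 11, k)


def s_formula(n):
    """Number of special words of length n for a -> aab, b -> ba."""
    if n == 0:
        return 1
    if n == 1:
        return 2
    k = 0
    while g(k) + 1 <= n:
        if n <= d(k):            # n lies in [g(k) + 1, d(k)]
            return 4
        k += 1
    return 3                     # n in {2, 3} or in some [d(k) + 1, g(k + 1)]


s_values = [s_formula(n) for n in range(16)]
p_values = [1]
for n in range(15):
    p_values.append(p_values[-1] + s_values[n])
print(s_values)
print(p_values)
```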